170 research outputs found

    Java-ML: a machine learning library

    Java-ML is a collection of machine learning and data mining algorithms, which aims to be a readily usable and easily extensible API for both software developers and research scientists. The interfaces for each type of algorithm are kept simple and algorithms strictly follow their respective interface. Comparing different classifiers or clustering algorithms is therefore straightforward, and implementing new algorithms is also easy. The implementations of the algorithms are clearly written, properly documented and can thus be used as a reference. The library is written in Java and is available from http://java-ml.sourceforge.net/ under the GNU GPL license

    First records of alien crayfish of the Procambarus acutus species complex in Belgium

    We present the first Belgian records of potentially invasive alien crayfish of the Procambarus acutus species complex, including the first confirmed record of P. acutus acutus. The species complex was observed at four different sites in three provinces in the north of the country. At only one site did the presence of a form I male specimen make identification to species level possible, based on gonopod morphology. The other three observations are considered as belonging to the P. acutus species complex. Procambarus acutus acutus is the fifth alien crayfish species known from Belgium. In Europe, it was previously known as an established alien species only from the Netherlands and Great Britain

    GenomeView: a next-generation genome browser

    Due to ongoing advances in sequencing technologies, billions of nucleotide sequences are now produced on a daily basis. A major challenge is to visualize these data for further downstream analysis. To this end, we present GenomeView, a stand-alone genome browser specifically designed to visualize and manipulate a multitude of genomics data. GenomeView enables users to dynamically browse high volumes of aligned short-read data, with dynamic navigation and semantic zooming, from the whole genome level to the single nucleotide. At the same time, the tool enables visualization of whole genome alignments of dozens of genomes relative to a reference sequence. GenomeView is unique in its capability to interactively handle huge data sets consisting of tens of aligned genomes, thousands of annotation features and millions of mapped short reads both as viewer and editor. GenomeView is freely available as an open source software package

    Toward a gold standard for promoter prediction evaluation

    Motivation: Promoter prediction is an important task in genome annotation projects, and during the past years many new promoter prediction programs (PPPs) have emerged. However, many of these programs are compared inadequately to other programs. In most cases, only a small portion of the genome is used to evaluate the program, which is not a realistic setting for whole genome annotation projects. In addition, a common evaluation design to properly compare PPPs is still lacking

    Highlights of the BioTM 2010 workshop on advances in bio text mining

    This meeting report gives an overview of the keynote lectures, the panel discussion and a selection of the contributed presentations. The workshop was held in Gent, Belgium on May 10-11. It featured a tutorial aimed towards a broad audience of (computational) biologists, (computational) linguists and researchers working purely on text mining

    Highlights from the 6th International Society for Computational Biology Student Council Symposium at the 18th Annual International Conference on Intelligent Systems for Molecular Biology

    This meeting report gives an overview of the keynote lectures and a selection of the student oral and poster presentations at the 6th International Society for Computational Biology Student Council Symposium that was held as a precursor event to the annual international conference on Intelligent Systems for Molecular Biology (ISMB). The symposium was held in Boston, MA, USA on July 9th, 2010

    Populations of latent Mycobacterium tuberculosis lack a cell wall: Isolation, visualization, and whole-genome characterization

    Objective/Background: Mycobacterium tuberculosis (MTB) causes active tuberculosis (TB) in only a small percentage of infected people. In most cases, the infection is clinically latent, where bacilli can persist in human hosts for years without causing disease. Surprisingly, the biology of such persister cells is largely unknown. This study describes the isolation, identification, and whole-genome sequencing (WGS) of latent TB bacilli after 782 days (26 months) of latency (the ability of MTB bacilli to lie persistent). Methods: The in vitro double-stress model of latency (oxygen and nutrition) was designed for MTB culture. After 26 months of latency, MTB cells that persisted were isolated and investigated under light and atomic force microscopy. Spoligotyping and WGS were performed to verify the identity of the strain. Results: We established a culture medium in which MTB bacilli arrest their growth, reduce their size (0.3–0.1 μm), lose their acid fastness (85–90%) and change their shape. Spoligopatterns of latent cells were identical to original H37Rv, with differences observed at spacers two and 14. WGS revealed only a few genetic changes relative to the already published H37Rv reference genome. Among these was a large 2064-bp insertion (RvD6), which was originally detected in both H37Ra and CDC1551, but not H37Rv. Conclusion: Here, we show cell-wall free cells of MTB bacilli in their latent state, and the biological adaptation of these cells was more phenotypic in nature than genomic. These cell-wall free cells represent a good model for understanding the nature of TB latency

    DiProDB: a database for dinucleotide properties

    DiProDB (http://diprodb.fli-leibniz.de) is a database of conformational and thermodynamic dinucleotide properties. It includes datasets both for DNA and RNA, as well as for single and double strands. The data have been shown to be important for understanding different aspects of nucleic acid structure and function, and they can also be used for encoding nucleic acid sequences. The database is intended to facilitate further applications of dinucleotide properties. A number of property datasets are highly correlated. Therefore, the database comes with a correlation analysis facility. Authors having determined new sets of dinucleotide property values are invited to submit these data to DiProDB
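
    As a rough illustration of what encoding a sequence with dinucleotide properties and checking property sets for correlation can look like, the sketch below maps each overlapping dinucleotide to a number and computes a Pearson coefficient between two property tables. The tables here are stand-ins derived by simple base counting (G/C count, A/T count, purine count); they are not values from DiProDB, and the function names and the sliding-window choice are assumptions made for this example.

```python
import math
from itertools import product

BASES = "ACGT"
DINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=2)]  # 16 entries


def count_table(members: str) -> dict:
    """Stand-in property: how many bases of a dinucleotide fall in `members`."""
    return {d: float(sum(b in members for b in d)) for d in DINUCLEOTIDES}


# Illustrative tables only; real property values would come from the database.
GC_COUNT = count_table("GC")
AT_COUNT = count_table("AT")
PURINE_COUNT = count_table("AG")


def encode(sequence: str, table: dict) -> list:
    """Encode a sequence as one property value per overlapping dinucleotide."""
    seq = sequence.upper()
    return [table[seq[i:i + 2]] for i in range(len(seq) - 1)]


def pearson(table_a: dict, table_b: dict) -> float:
    """Pearson correlation of two property tables over all 16 dinucleotides."""
    xs = [table_a[d] for d in DINUCLEOTIDES]
    ys = [table_b[d] for d in DINUCLEOTIDES]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant table")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    print(encode("ACGT", GC_COUNT))
    print(pearson(GC_COUNT, AT_COUNT))
    print(pearson(GC_COUNT, PURINE_COUNT))
```

    In this toy setup the G/C and A/T tables carry the same information with opposite sign, which is the kind of redundancy a correlation analysis is meant to expose before several property sets are combined into one encoding.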

    Translation initiation site prediction on a genomic scale: beauty in simplicity

    Motivation: The correct identification of translation initiation sites (TIS) remains a challenging problem for automated computational methods. Furthermore, the lion's share of these computational techniques focuses on the identification of TIS in transcript data. However, in the gene prediction context the identification of TIS occurs on the genomic level, which makes things even harder because at the genome level many more pseudo-TIS occur, resulting in models that produce a higher number of false positive predictions. Results: In this article, we evaluate the performance of several 'simple' TIS recognition methods at the genomic level, and compare them to state-of-the-art models for TIS prediction in transcript data. We conclude that the simple methods largely outperform the complex ones at the genomic scale, and we propose a new model for TIS recognition at the genome level that combines the strengths of these simple models. The new model obtains a false positive rate of 0.125 at a sensitivity of 0.80 on a well annotated human chromosome (chromosome 21). Detailed analyses show that the model is useful, both on its own and in a simple gene prediction setting
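
    For readers unfamiliar with the two figures quoted in the abstract above, the sketch below computes sensitivity and false positive rate from confusion-matrix counts using the standard definitions. It assumes the abstract uses those definitions, which the listing does not confirm. The counts are hypothetical, chosen only so that the ratios match the reported 0.80 and 0.125; they are not data from the paper.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of real TIS that the model recovers: TP / (TP + FN)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no positive examples")
    return true_positives / total


def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """Fraction of pseudo-TIS wrongly predicted as real: FP / (FP + TN)."""
    total = false_positives + true_negatives
    if total == 0:
        raise ValueError("no negative examples")
    return false_positives / total


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    tp, fn = 80, 20      # 100 real TIS
    fp, tn = 125, 875    # 1000 pseudo-TIS
    print(sensitivity(tp, fn))
    print(false_positive_rate(fp, tn))
```

    Because pseudo-TIS vastly outnumber real sites at the genome level, even a modest false positive rate can translate into far more false predictions than true ones, which is why the abstract stresses the genomic setting.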